Correcting a Significance Test for Clustering

Author

  • Larry V. Hedges
Abstract

A common mistake in the analysis of cluster-randomized trials is to ignore the effect of clustering and analyze the data as if each treatment group were a simple random sample. This typically leads to an overstatement of the precision of results and to anticonservative conclusions about the precision and statistical significance of treatment effects. This working paper gives a simple correction to the t-statistic that would be computed if clustering were (incorrectly) ignored. The correction is a multiplicative factor depending on the total sample size, the cluster size, and the intraclass correlation ρ. The corrected t-statistic has a Student's t-distribution with reduced degrees of freedom. The corrected statistic reduces to the t-statistic computed by ignoring clustering when ρ = 0. It reduces to the t-statistic computed using cluster means when ρ = 1. If 0 < ρ < 1, it lies between these two, and the degrees of freedom are between those corresponding to these two extremes.

Field experiments often assign entire intact groups (such as sites, classrooms, or schools) to the same treatment group, with different intact groups assigned to different treatments. Because these intact groups correspond to clusters, this design is often called a group randomized or cluster randomized design. Several analysis strategies for cluster randomized trials are possible, but the simplest is to carry out a two-stage analysis. That is, one computes the cluster mean scores on the outcome (and on all other variables that may be involved in the analysis) and carries out the statistical analysis as if the site (cluster) means were the data. If all cluster sample sizes are equal, this approach provides exact tests for the treatment effect, but the tests may have lower statistical power than would be obtained by other approaches (see, e.g., Blair and Higgins, 1986).
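The two-stage analysis described above can be sketched in a few lines of Python. This is a minimal illustration, assuming equal cluster sizes and a pooled-variance two-sample t-test on the cluster means; the function name `cluster_means_t` and the example data are hypothetical and do not come from the paper.

```python
from math import sqrt
from statistics import mean, variance


def cluster_means_t(treatment_clusters, control_clusters):
    """Two-stage analysis: treat each cluster mean as a single observation.

    Each argument is a list of clusters; each cluster is a list of outcomes.
    Returns the pooled-variance t-statistic and its degrees of freedom.
    """
    # Stage 1: reduce every cluster to its mean.
    t_means = [mean(cluster) for cluster in treatment_clusters]
    c_means = [mean(cluster) for cluster in control_clusters]
    m_t, m_c = len(t_means), len(c_means)

    # Stage 2: ordinary two-sample t-test on the cluster means.
    df = m_t + m_c - 2
    pooled_var = ((m_t - 1) * variance(t_means)
                  + (m_c - 1) * variance(c_means)) / df
    se = sqrt(pooled_var * (1 / m_t + 1 / m_c))
    t_stat = (mean(t_means) - mean(c_means)) / se
    return t_stat, df


# Hypothetical data: 3 clusters of 3 observations per arm.
treatment = [[5, 6, 7], [6, 7, 8], [7, 8, 9]]
control = [[3, 4, 5], [4, 5, 6], [5, 6, 7]]
t_stat, df = cluster_means_t(treatment, control)
```

Note that the degrees of freedom depend on the number of clusters (here 3 + 3 - 2 = 4), not on the number of individuals, which is one reason this test can have lower power than alternatives.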
More flexible and informative analyses are also available, including analyses of variance using clusters as a nested factor (see, e.g., Hopkins, 1982) and analyses involving hierarchical linear models (see, e.g., Raudenbush and Bryk, 2002). For general discussions of the design and analysis of cluster randomized experiments, see Raudenbush and Bryk (2002), Donner and Klar (2000), Klar and Donner (2001), Murray (1998), or Murray, Varnell, and Blitstein (2004).

A common mistake in the analysis of cluster randomized trials is to analyze the data as if they were a simple random sample and assignment had been carried out at the level of individuals. This typically leads to an overstatement of the precision of results and consequently to anticonservative conclusions about the precision and statistical significance of treatment effects (see Murray, Hannan, and Baker, 1996). This analysis can also yield misleading estimates of effect sizes and incorrect estimates of their sampling uncertainty. If the raw data are available, then reanalysis using more appropriate analytic methods is usually desirable. In some cases, however, the raw data are not available, but it is still desirable to interpret the findings of a research report that improperly ignored clustering in the analysis.

This problem often arises in reviewing the findings of studies carried out by other investigators. In particular, it has arisen in the work of the What Works Clearinghouse, a project funded by the US Institute of Education Sciences whose mission is to evaluate, compare, and synthesize evidence of the effectiveness of educational programs, products, practices, and policies.
What Works Clearinghouse reviewers found that, in the first areas they were investigating, the majority of the high-quality studies involved assignment to treatments by clusters, but most of those studies did not account for clustering in their evaluation of the statistical significance of treatment effects. In this context, it would be desirable to know how the conclusions about treatment effects might change if clustering were taken into account.

The purpose of this paper is to provide an analysis of the effects of clustering on significance tests and confidence intervals for treatment effects. First we derive the sampling distribution of the t-statistic under a clustered sampling model with equal cluster sample sizes. The derivations provide some insight into the properties of suggestions that have appeared in the literature for adjusting significance tests for the effects of clustering. Then we provide a generalization for unequal cluster sample sizes. This research provides a simple correction that may be applied to a statistical test that was computed (incorrectly) ignoring the clustering of individuals within groups. The correction requires that a bound on the amount of clustering (in the form of an upper bound on the intraclass correlation parameter) is known or that the intraclass correlation parameter can be imputed for sensitivity analysis. We then derive confidence intervals for the mean difference based on the corrected test statistic. Finally we consider the power of the corrected test.

Model and Notation

Let $Y_{ij}^T$ ($i = 1, \ldots, m^T$; $j = 1, \ldots, n_i^T$) and $Y_{ij}^C$ ($i = 1, \ldots, m^C$; $j = 1, \ldots, n_i^C$) be the $j$th observation in the $i$th cluster in the treatment and control groups, respectively, so that there are $m^T$ clusters in the treatment group and $m^C$ clusters in the control group, and a total of $M = m^T + m^C$ clusters, with $n$ observations each when cluster sizes are equal.
Thus the sample size in the treatment group is

$$N^T = \sum_{i=1}^{m^T} n_i^T,$$

the sample size in the control group is

$$N^C = \sum_{i=1}^{m^C} n_i^C,$$

and the total sample size is $N = N^T + N^C$. Let $\bar{Y}_{i\bullet}^T$ ($i = 1, \ldots, m^T$) and $\bar{Y}_{i\bullet}^C$ ($i = 1, \ldots, m^C$) be the means of the $i$th cluster in the treatment and control groups, respectively, and let $\bar{Y}_{\bullet\bullet}^T$ and $\bar{Y}_{\bullet\bullet}^C$ be the overall (grand) means in the treatment and control groups, respectively. Define the (pooled) within-treatment-group variance $S^2$ via

$$S^2 = \frac{\sum_{i=1}^{m^T} \sum_{j=1}^{n_i^T} \left(Y_{ij}^T - \bar{Y}_{\bullet\bullet}^T\right)^2 + \sum_{i=1}^{m^C} \sum_{j=1}^{n_i^C} \left(Y_{ij}^C - \bar{Y}_{\bullet\bullet}^C\right)^2}{N - 2}. \qquad (1)$$

Suppose that observations within the treatment and control group clusters are normally distributed about cluster means $\mu_i^T$ and $\mu_i^C$ with a common within-cluster variance $\sigma_W^2$. That is, $Y_{ij}^T \sim N(\mu_i^T, \sigma_W^2)$, $i = 1, \ldots, m^T$; $j = 1, \ldots, n_i^T$
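
To make the notation concrete, the sketch below computes $S^2$ from equation (1) and the ordinary two-sample t-statistic that ignores clustering, then applies a multiplicative correction with adjusted degrees of freedom. The correction formula is not stated in this excerpt: the closed form used in `corrected_t` is an assumption, namely the equal-cluster-size form as recalled from the published version of this work, and it should be checked against the full text before use. All function names, the data, and the value of `rho` are hypothetical.

```python
from math import sqrt


def pooled_variance(treatment_clusters, control_clusters):
    """S^2 from equation (1): squared deviations from each group's grand mean."""
    y_t = [y for cluster in treatment_clusters for y in cluster]
    y_c = [y for cluster in control_clusters for y in cluster]
    grand_t = sum(y_t) / len(y_t)
    grand_c = sum(y_c) / len(y_c)
    ss = (sum((y - grand_t) ** 2 for y in y_t)
          + sum((y - grand_c) ** 2 for y in y_c))
    return ss / (len(y_t) + len(y_c) - 2)


def naive_t(treatment_clusters, control_clusters):
    """Standard two-sample t-statistic computed as if clustering did not exist."""
    y_t = [y for cluster in treatment_clusters for y in cluster]
    y_c = [y for cluster in control_clusters for y in cluster]
    s2 = pooled_variance(treatment_clusters, control_clusters)
    diff = sum(y_t) / len(y_t) - sum(y_c) / len(y_c)
    return diff / sqrt(s2 * (1 / len(y_t) + 1 / len(y_c)))


def corrected_t(t, total_n, cluster_n, rho):
    """ASSUMED equal-cluster-size correction (not given in this excerpt).

    total_n is N, cluster_n is n, rho is an imputed intraclass correlation.
    Returns the corrected t-statistic and its degrees of freedom.
    """
    num = (total_n - 2) - 2 * (cluster_n - 1) * rho
    factor = sqrt(num / ((total_n - 2) * (1 + (cluster_n - 1) * rho)))
    df = num ** 2 / (
        (total_n - 2) * (1 - rho) ** 2
        + cluster_n * (total_n - 2 * cluster_n) * rho ** 2
        + 2 * (total_n - 2 * cluster_n) * rho * (1 - rho)
    )
    return factor * t, df


# Hypothetical data: 3 clusters of 3 observations per arm (N = 18, n = 3).
treatment = [[5, 6, 7], [6, 7, 8], [7, 8, 9]]
control = [[3, 4, 5], [4, 5, 6], [5, 6, 7]]
s2 = pooled_variance(treatment, control)
t = naive_t(treatment, control)
t_adj, df_adj = corrected_t(t, total_n=18, cluster_n=3, rho=0.2)  # rho is imputed
```

Under the assumed formula, setting `rho=0` leaves the t-statistic unchanged with N - 2 degrees of freedom, and setting `rho=1` gives M - 2 degrees of freedom, which agrees with the limiting behavior described in the abstract. Because ρ is rarely known, running the correction over a range of plausible values is a natural form of the sensitivity analysis the paper mentions.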


Publication date: 2005